AITopics

2403.19283

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.05)
(12 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Taouti, Souad, Cherroun, Hadda, Ziadi, Djelloul

Parallel Tree Kernel Computation

arXiv.org Artificial IntelligenceMay-12-2023

Tree kernels are fundamental tools that have been leveraged in many applications, particularly those based on machine learning for Natural Language Processing tasks. In this paper, we devise a parallel implementation of the sequential algorithm for the computation of some tree kernels of two finite sets of trees (Ouali-Sebti, 2015). Our comparison is narrowed on a sequential implementation of SubTree kernel computation. This latter is mainly reduced to an intersection of weighted tree automata. Our approach relies on the nature of the data parallelism source inherent in this computation by deploying the MapReduce paradigm. One of the key benefits of our approach is its versatility in being adaptable to a wide range of substructure tree kernel-based learning methods. To evaluate the efficacy of our parallel approach, we conducted a series of experiments that compared it against the sequential version using a diverse set of synthetic tree language datasets that were manually crafted for our analysis. The reached results clearly demonstrate that the proposed parallel algorithm outperforms the sequential one in terms of latency.

machine learning, natural language, rwta, (19 more...)

2305.07717

Country:

Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
Africa > Middle East > Algeria > Laghouat Province > Laghouat (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

Mignot, Ludovic, Ouardi, Faissal, Ziadi, Djelloul

New Linear-time Algorithm for SubTree Kernel Computation based on Root-Weighted Tree Automata

arXiv.org Artificial IntelligenceFeb-2-2023

Tree kernels have been proposed to be used in many areas as the automatic learning of natural language applications. In this paper, we propose a new linear time algorithm based on the concept of weighted tree automata for SubTree kernel computation. First, we introduce a new class of weighted tree automata, called Root-Weighted Tree Automata, and their associated formal tree series. Then we define, from this class, the SubTree automata that represent compact computational models for finite tree languages. This allows us to design a theoretically guaranteed linear-time algorithm for computing the SubTree Kernel based on weighted tree automata intersection. The key idea behind the proposed algorithm is to replace DAG reduction and nodes sorting steps used in previous approaches by states equivalence classes computation allowed in the weighted tree automata approach. Our approach has three major advantages: it is output-sensitive, it is free sensitive from the tree types (ordered trees versus unordered trees), and it is well adapted to any incremental tree kernel based learning methods. Finally, we conduct a variety of comparative experiments on a wide range of synthetic tree languages datasets adapted for a deep algorithm analysis. The obtained results show that the proposed algorithm outperforms state-of-the-art methods.

artificial intelligence, machine learning, natural language, (20 more...)

2302.01097

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > District of Columbia > Washington (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

arXiv.org Artificial IntelligenceSep-5-2021

Structural Optimization Makes Graph Classification Simpler and Better

Wu, Junran, Li, Jianhao, Pan, Yicheng, Xu, Ke

In deep neural networks, better results can often be obtained by increasing the complexity of previously developed basic models. However, it is unclear whether there is a way to boost performance by decreasing the complexity of such models. Here, based on an optimization method, we investigate the feasibility of improving graph classification performance while simplifying the model learning process. Inspired by progress in structural information assessment, we optimize the given data sample from graphs to encoding trees. In particular, we minimize the structural entropy of the transformed encoding tree to decode the key structure underlying a graph. This transformation is denoted as structural optimization. Furthermore, we propose a novel feature combination scheme, termed hierarchical reporting, for encoding trees. In this scheme, features are transferred from leaf nodes to root nodes by following the hierarchical structures of encoding trees. We then present an implementation of the scheme in a tree kernel and a convolutional network to perform graph classification. The tree kernel follows label propagation in the Weisfeiler-Lehman (WL) subtree kernel, but it has a lower runtime complexity $O(n)$. The convolutional network is a special implementation of our tree kernel in the deep learning field and is called Encoding Tree Learning (ETL). We empirically validate our tree kernel and convolutional network with several graph classification benchmarks and demonstrate that our methods achieve better performance and lower computational consumption than competing approaches.

dataset, graph, tree kernel, (13 more...)

2109.02027

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Chrupała, Grzegorz, Alishahi, Afra

Correlating neural and symbolic representations of language

arXiv.org Artificial IntelligenceJun-5-2019

Analysis methods which enable us to better understand the representations and functioning of neural models of language are increasingly needed as deep learning becomes the dominant approach in NLP. Here we present two methods based on Representational Similarity Analysis (RSA) and Tree Kernels (TK) which allow us to directly quantify how strongly the information encoded in neural activation patterns corresponds to information represented by symbolic structures such as syntax trees. We first validate our methods on the case of a simple synthetic language for arithmetic expressions with clearly defined syntax and semantics, and show that they exhibit the expected pattern of results. We then apply our methods to correlate neural representations of English sentences with their constituency parse trees.

artificial intelligence, machine learning, natural language, (16 more...)

doi: 10.18653/v1/P19-1283

1905.06401

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.90)

Garg, Sahil, Steeg, Greg Ver, Galstyan, Aram

Stochastic Learning of Nonstationary Kernels for Natural Language Modeling

arXiv.org Machine LearningFeb-1-2018

Natural language processing often involves computations with semantic or syntactic graphs to facilitate sophisticated reasoning based on structural relationships. While convolution kernels provide a powerful tool for comparing graph structure based on node (word) level relationships, they are difficult to customize and can be computationally expensive. We propose a generalization of convolution kernels, with a nonstationary model, for better expressibility of natural languages in supervised settings. For a scalable learning of the parameters introduced with our model, we propose a novel algorithm that leverages stochastic sampling on k-nearest neighbor graphs, along with approximations based on locality-sensitive hashing. We demonstrate the advantages of our approach on a challenging real-world (structured inference) problem of automatically extracting biological models from the text of scientific papers.

artificial intelligence, machine learning, natural language, (19 more...)

1801.03911

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.68)

Kaljahi, Rasoul, Foster, Jennifer

Any-gram Kernels for Sentence Classification: A Sentiment Analysis Case Study

arXiv.org Machine LearningDec-19-2017

Any-gram kernels are a flexible and efficient way to employ bag-of-n-gram features when learning from textual data. They are also compatible with the use of word embeddings so that word similarities can be accounted for. While the original any-gram kernels are implemented on top of tree kernels, we propose a new approach which is independent of tree kernels and is more efficient. We also propose a more effective way to make use of word embeddings than the original any-gram formulation. When applied to the task of sentiment classification, our new formulation achieves significantly better performance.

artificial intelligence, machine learning, natural language, (20 more...)

1712.07004

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)
(2 more...)

AAAI ConferencesAug-3-2013

Fast Linearization of Tree Kernels over Large-Scale Data

Severyn, Aliaksei (University of Trento) | Moschitti, Alessandro (University of Tretno)

Convolution tree kernels have been successfully applied to many language processing tasks for achieving state-of-the-art accuracy. Unfortunately, higher computational complexity of learning with kernels w.r.t. using explicit feature vectors makes them less attractive for large-scale data.In this paper, we study the latest approaches to solve such problems ranging from feature hashing to reverse kernel engineering and approximate cutting plane training with model compression. We derive a novel method that relies on reverse-kernel engineering together with an efficient kernel learning method. The approach gives the advantage of using tree kernels to automatically generate rich structured feature spaces and working in the linear space where learning and testing is fast. We experimented with training sets up to 4 million examples from Semantic Role Labeling. The results show that (i) the choice of correct structural features is essential and (ii) we can speed-up training from weeks to less than 20 minutes.

fast linearization, large-scale data, tree kernel

AAAI Conferences

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kimura, Daisuke, Kashima, Hisashi

Fast Computation of Subpath Kernel for Trees

arXiv.org Machine LearningJun-18-2012

The kernel method is a potential approach to analyzing structured data such as sequences, trees, and graphs; however, unordered trees have not been investigated extensively. Kimura et al. (2011) proposed a kernel function for unordered trees on the basis of their subpaths, which are vertical substructures of trees responsible for hierarchical information in them. Their kernel exhibits practically good performance in terms of accuracy and speed; however, linear-time computation is not guaranteed theoretically, unlike the case of the other unordered tree kernel proposed by Vishwanathan and Smola (2003). In this paper, we propose a theoretically guaranteed linear-time kernel computation algorithm that is practically fast, and we present an efficient prediction algorithm whose running time depends only on the size of the input tree. Experimental results show that the proposed algorithms are quite efficient in practice.

algorithm, artificial intelligence, machine learning, (13 more...)

1206.4642

Country: Asia > Japan > Honshū (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

Zanzotto, Fabio Massimo, Dell'Arciprete, Lorenzo

Distributed Tree Kernels

arXiv.org Machine LearningJun-18-2012

In this paper, we propose the distributed tree kernels (DTK) as a novel method to reduce time and space complexity of tree kernels. Using a linear complexity algorithm to compute vectors for trees, we embed feature spaces of tree fragments in low-dimensional spaces where the kernel computation is directly done with dot product. We show that DTKs are faster, correlate with tree kernels, and obtain a statistically similar performance in two natural language processing tasks.

artificial intelligence, machine learning, natural language, (17 more...)

1206.4607

Country:

Europe (1.00)
North America > United States > Oregon (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)